Vibrio cholerae, a Gram-negative pathogen autochthonous to the aquatic environment, is the causative agent of cholera. Here, 
we report the complete genome sequence of V. cholerae G4222, a clinical isolate from South Africa. 
Vibrio cholerae is the causative agent of cholera, a severe diarrheal
disease that remains a socioeconomic burden in many developing
countries (1). As such, the species, also widely regarded as a model
organism for studies pertaining to water-borne pathogens, has received
much attention (2). Among other foci, emphasis has been placed on
defining serotypes and biotypes in an attempt to understand this
rapidly evolving pathogen (3). However, with the discoveries of
intraspecies serogroup transfer and contradicting biotype-related
phenotypes, the focus has shifted away from classic diagnostic markers
to genome-based comparisons with an emphasis on mobile elements (3, 4).
To this end, the genome sequence of V. cholerae G4222, a clinical O1
isolate obtained in South Africa during the 2000-2001 epidemic, was
determined. This represents the first genome of a South African
V. cholerae strain. In light of the dearth of African V. cholerae
strain sequences and the recent discovery of new recombinant
V. cholerae biotypes in southern Africa, this genome might provide
valuable insights into the evolution of this pathogen (5, 6).
Furthermore, the genome sequence of this strain may contribute to the
understanding of the V. cholerae mobilome.

The genome was sequenced using a Roche 454 GS-FLX sequencer at Inqaba
Biotec, South Africa. A total of 201,286 reads with an average read
length of 236 bp were obtained, giving a total of 47,503,496
nucleotides and genome coverage of 11.6×. The reads were assembled
into 280 contigs using Newbler assembler v2.6 (454 Life Sciences).
These contigs were then scaffolded by alignment against the complete
genome sequences of V. cholerae MJ-1236 (7) and V. cholerae O1 biovar
El Tor strain N16961 (8) with the NCBI Genomic (NG) Aligner tool of the
NCBI Genome Workbench v2.5.5. A further 38 gaps were closed by PCR
amplification and Sanger sequencing. This resulted in the assembly of
the V. cholerae G4222 genome sequence into a total of 21 contigs.
Protein-coding sequence (CDS) prediction and functional annotation of
the predicted proteins were done using the Rapid Annotations using
Subsystems Technology (RAST) Web server (9) before manual curation was
performed.
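
The read statistics above can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the original analysis: the function names are hypothetical, and it assumes coverage is defined as total sequenced bases divided by genome size. The genome size used for the reported coverage is not stated in the text, so the sketch only back-calculates the size that the reported figure would imply.

```python
def total_bases(read_count: int, mean_read_length: int) -> int:
    """Total sequenced nucleotides: number of reads times mean read length."""
    return read_count * mean_read_length


def coverage(total_bp: float, genome_size_bp: float) -> float:
    """Fold coverage, assuming coverage = total bases / genome size."""
    return total_bp / genome_size_bp


# Values reported in the text.
reads = 201_286
mean_len_bp = 236
reported_coverage = 11.6

bases = total_bases(reads, mean_len_bp)

# Genome size implied by the reported coverage (an inference, not a
# value given in the text).
implied_genome_size_bp = bases / reported_coverage

print(f"total bases: {bases:,}")
print(f"implied genome size: {implied_genome_size_bp:,.0f} bp")
```

The product of read count and mean read length reproduces the reported nucleotide total, and the implied genome size is in the range expected for V. cholerae.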

The V. cholerae G4222 contigs could be scaffolded into two distinct
chromosomes, as is typical of V. cholerae strains (10). Chromosome I
consists of 14 contigs amounting to a total length of 3,139,654 bp,
with an average G+C content of 47.72% and 2,809 annotated CDS.
Chromosome II consists of seven contigs with a total length of
1,061,058 bp and a G+C content of 46.88%, with 1,051 CDS annotated. The
chromosome sizes and G+C compositions correlate well with those of
other V. cholerae strains (6, 8). An ~150-kb integrative conjugative
element (ICE), belonging to the SXT family, is located on chromosome I
of V. cholerae G4222 and carries the genes involved in multiple-drug
resistance. Given that an African origin for SXT-related ICEs in
V. cholerae strains has been proposed, the genome of V. cholerae G4222
provides further opportunity to investigate the evolution of SXT
elements (11). The strain might also provide insights into the biology
of South African epidemic V. cholerae strains.
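
The per-chromosome figures above can be combined into whole-genome totals. The following is a minimal Python sketch for illustration only: the dictionary layout and variable names are hypothetical, and the genome-wide G+C value is a length-weighted average of the two reported percentages, which is an approximation rather than a figure from the original analysis.

```python
# Per-chromosome statistics as reported in the text.
chromosomes = {
    "I": {"contigs": 14, "length_bp": 3_139_654, "gc_percent": 47.72, "cds": 2_809},
    "II": {"contigs": 7, "length_bp": 1_061_058, "gc_percent": 46.88, "cds": 1_051},
}

total_contigs = sum(c["contigs"] for c in chromosomes.values())
total_length_bp = sum(c["length_bp"] for c in chromosomes.values())
total_cds = sum(c["cds"] for c in chromosomes.values())

# Length-weighted mean G+C content across both chromosomes (approximation).
weighted_gc_percent = (
    sum(c["length_bp"] * c["gc_percent"] for c in chromosomes.values())
    / total_length_bp
)

print(f"contigs: {total_contigs}")
print(f"genome length: {total_length_bp:,} bp")
print(f"annotated CDS: {total_cds:,}")
print(f"weighted G+C: {weighted_gc_percent:.2f}%")
```

The contig counts of the two chromosomes sum to the 21 contigs reported for the final assembly.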

Nucleotide sequence accession numbers. This Whole Genome Shotgun
project has been deposited at DDBJ/EMBL/GenBank under the accession
no. ANNB00000000. The version described in this paper is the first
version, ANNB01000000.